Genetic Algorithm Based Text Categorization Using OLEX Method

Author

  • S. Manjula
Abstract

This paper describes a new similarity-based genetic algorithm (GA) and thresholding strategies (R&SCut variants). The GA is designed to weight terms according to their semantic content and importance, using their co-occurrence information and discriminating power values for similarity computation. After investigating the common existing thresholding strategies, the paper designs a multiclass text categorization scheme in which documents may belong to a variable number of categories. Extensive comparative experiments were conducted on two standard text collections, Reuters-21578 and 20Newsgroups. The results are evaluated with a standard measure, F1, using both micro- and macro-averaging. They show that the GA and the R&SCut variants work better than other widely used techniques.

Keywords: Genetic algorithm, Olex method, Classification, Text categorization
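The abstract reports micro- and macro-averaged F1 but does not say how they were computed. The following is a minimal sketch of the standard definitions for the multi-label case, not the paper's evaluation code; the function names and the toy category labels are assumptions made for illustration only.

```python
from collections import Counter


def f1(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(true_labels, pred_labels):
    """Return (micro_f1, macro_f1) for multi-label assignments.

    true_labels and pred_labels are parallel lists of sets, one set of
    category names per document (hypothetical input format).
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(true_labels, pred_labels):
        for c in pred & truth:
            tp[c] += 1  # category correctly assigned
        for c in pred - truth:
            fp[c] += 1  # category assigned but wrong
        for c in truth - pred:
            fn[c] += 1  # category missed
    categories = set(tp) | set(fp) | set(fn)
    # Micro-averaging pools the counts over all categories first.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro-averaging scores each category, then takes the plain mean.
    macro = (
        sum(f1(tp[c], fp[c], fn[c]) for c in categories) / len(categories)
        if categories
        else 0.0
    )
    return micro, macro


# Toy data (made up): four documents, three categories.
truth = [{"earn"}, {"earn", "acq"}, {"acq"}, {"grain"}]
pred = [{"earn"}, {"earn"}, {"acq", "grain"}, set()]
micro, macro = micro_macro_f1(truth, pred)
print(f"micro-F1 = {micro:.3f}, macro-F1 = {macro:.3f}")
```

Micro-averaging is dominated by frequent categories, while macro-averaging gives every category equal weight, which is why results on skewed collections such as Reuters-21578 are commonly reported both ways.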


Similar resources

Improving the Operation of Text Categorization Systems with Selecting Proper Features Based on PSO-LA

With the explosive growth in the amount of information, tools and methods are needed to search, filter, and manage resources. One of the major problems in text classification is the high dimensionality of the feature space. Therefore, the main goal of text classification is to reduce the dimensionality of the feature space. There are many feature selection methods. However...


Olex a booming Approach for Text categorization

The growth of information is enormous, and handling it has also become an issue for researchers. Several algorithms have been proposed to handle the information in a better way. OLEX booming law knowledge for Text arrangement, and the Olex learning problem is stated as an optimization problem relying on the F-measure as the objective function. In this study we focused on biological inf...


A Genetic Algorithm for Text Classification Rule Induction

This paper presents a Genetic Algorithm, called Olex-GA, for the induction of rule-based text classifiers of the form “classify document d under category c if t1 ∈ d or ... or tn ∈ d and not (tn+1 ∈ d or ... or tn+m ∈ d) holds”, where each ti is a term. Olex-GA relies on an efficient several-rules-per-individual binary representation and uses the F-measure as the fitness function. The proposed...


Cluster Based Hybrid Niche Mimetic and Genetic Algorithm for Text Document Categorization

An efficient cluster based hybrid niche mimetic and genetic algorithm for text document categorization to improve the retrieval rate of relevant document fetching is addressed. The proposal minimizes the processing of structuring the document with better feature selection using hybrid algorithm. In addition restructuring of feature words to associated documents gets reduced, in turn increases d...


Text Summarization Using Cuckoo Search Optimization Algorithm

Today, with the rapid growth of the World Wide Web and the creation of Internet sites and online text resources, text summarization has attracted the attention of many researchers. Extractive text summarization is an important summarization method that consists of selecting the top representative sentences from the input document. When we are facing large data volume documents, the extr...



Publication date: 2011